Control Variates for Stochastic Gradient MCMC
Abstract
It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. We compare the performance of two classes of methods which aim to solve this issue: stochastic gradient MCMC (SGMCMC) and divide-and-conquer methods. We find an SGMCMC method, stochastic gradient Langevin dynamics (SGLD), to be the most robust in these comparisons. This method makes use of a noisy estimate of the gradient of the log posterior, which significantly reduces the per-iteration computational cost of the algorithm. We analyse the algorithm over different dataset sizes and show that, despite the per-iteration saving, the computational cost is still proportional to the dataset size. We use control variates, a method to reduce the variance in Monte Carlo estimates, to reduce this computational cost to O(1). Next, we show that a different control variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This post-processing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.
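As a rough illustration of the control variate idea described in the abstract, the sketch below contrasts the plain subsampled gradient estimate used by SGLD with a control variate version centred at a fixed point `theta_hat` near the posterior mode. It is not taken from the paper's code: the logistic regression model, the standard normal prior, and all function names (`sgld_gradient`, `cv_gradient`, `sgld_cv_step`) are assumptions made for this example. The full-data gradient at `theta_hat` is computed once, and each iteration then touches only a minibatch.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def grad_log_prior(theta):
    # Assumed prior for this sketch: theta ~ N(0, I).
    return -theta


def grad_log_lik(theta, X, y):
    # Per-observation log-likelihood gradients for logistic regression,
    # returned as an array of shape (number of rows in X, dimension).
    return (y - sigmoid(X @ theta))[:, None] * X


def full_gradient(theta, X, y):
    # Exact gradient of the log posterior; costs O(N) per call.
    return grad_log_prior(theta) + grad_log_lik(theta, X, y).sum(axis=0)


def sgld_gradient(theta, X, y, idx):
    # Plain SGLD estimate: rescale the minibatch sum by N / n.
    N = X.shape[0]
    batch = grad_log_lik(theta, X[idx], y[idx]).sum(axis=0)
    return grad_log_prior(theta) + (N / len(idx)) * batch


def cv_gradient(theta, theta_hat, full_grad_hat, X, y, idx):
    # Control variate estimate: exact gradient at theta_hat plus a
    # subsampled estimate of the gradient difference theta vs theta_hat.
    N = X.shape[0]
    diff = (grad_log_lik(theta, X[idx], y[idx])
            - grad_log_lik(theta_hat, X[idx], y[idx]))
    prior_diff = grad_log_prior(theta) - grad_log_prior(theta_hat)
    return full_grad_hat + prior_diff + (N / len(idx)) * diff.sum(axis=0)


def sgld_cv_step(theta, theta_hat, full_grad_hat, X, y, step, batch_size, rng):
    # One Langevin update: theta + (step / 2) * gradient + N(0, step * I) noise.
    idx = rng.choice(X.shape[0], size=batch_size, replace=False)
    g = cv_gradient(theta, theta_hat, full_grad_hat, X, y, idx)
    noise = np.sqrt(step) * rng.normal(size=theta.shape)
    return theta + 0.5 * step * g + noise
```

In this sketch the variance of `cv_gradient` shrinks as `theta` approaches `theta_hat`, because the subsampled term only estimates a gradient difference. That is the intuition behind the reduced cost claimed in the abstract, under the assumption that the chain stays close to the mode.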
Similar resources
Notes on Using Control Variates for Estimation with Reversible MCMC Samplers
A general methodology is presented for the construction and effective use of control variates for reversible MCMC samplers. The values of the coefficients of the optimal linear combination of the control variates are computed, and adaptive, consistent MCMC estimators are derived for these optimal coefficients. All methodological and asymptotic arguments are rigorously justified. Numerous MCMC s...
Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods
Our goal is to improve variance reducing stochastic methods through better control variates. We first propose a modification of SVRG which uses the Hessian to track gradients over time, rather than to recondition, increasing the correlation of the control variates and leading to faster theoretical convergence close to the optimum. We then propose accurate and computationally efficient approxima...
Distributed Stochastic Gradient MCMC
Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw mini-batches from their local pool ...
Markov Interacting Importance Samplers
We introduce a new Markov chain Monte Carlo (MCMC) sampler called the Markov Interacting Importance Sampler (MIIS). The MIIS sampler uses conditional importance sampling (IS) approximations to jointly sample the current state of the Markov Chain and estimate conditional expectations, possibly by incorporating a full range of variance reduction techniques. We compute Rao-Blackwellized estimates ...
Stochastic Gradient MCMC with Stale Gradients
Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is becoming increasingly popular to employ distributed systems, where stochastic gradients are computed based on some outdated parameters, yielding what are termed stale gradients. While stale gradients could...
Journal title: CoRR
Volume: abs/1706.05439
Publication date: 2017